METHODS AND APPARATUS FOR KEYSTREAM GENERATION

This application claims the benefit of U.S. 
provisional patent application no. 60/167,185 filed on 
November 23, 1999. 



The present invention relates generally to the
fields of security and cryptography. Methods and
apparatus for the generation of a random cryptographic
one way function (i.e. a keystream) for use in
encrypting or decrypting binary data are provided.
Binary data may comprise digitized video or audio, as
well as Internet Protocol or other pure data services.
Although the invention has particular applicability in
controlling access to premium services in cable and
satellite television systems, it is also applicable to
the protection of other information which is
communicated in electronic form.

More particularly, the present invention provides
a non-linear keystream generation algorithm using
multiple feedback shift registers. The feedback shift
registers may be constructed utilizing an advanced
mathematical construct called an extended Galois Field
GF(2^m). The keystream is generated as a non-linear
function of the outputs of the multiple feedback shift
registers, which may be a combination of static
feedback shift registers and dynamic feedback shift
registers. Dense primitive polynomials with many
coefficients may be used to produce a cryptographically
robust keystream for use as an encryption or decryption
key.

One way that binary data may be scrambled
(encrypted) is by processing the binary data with a
keystream (cryptographic key) to produce ciphertext
(encrypted data). Keystreams are based on a sequence of
bits that can be generated by pseudorandom sequences.
The ciphertext can then be decrypted using an identical
keystream. Data content owners want technology used in
copy protection key derivation to be unique and
difficult to duplicate. To achieve this end, binary
data can be processed through a hash function by using
that data as the input to a cryptographic One Way Hash
Function, and by using the output of that Function as
an encryption key for other binary data.



Commonly assigned U.S. patent no. 4,860,353
describes a dynamic feedback arrangement scrambling
technique (DFAST) keystream generator. DFAST utilizes a
dynamic feedback shift register, the structure of which
is varied by a polynomial code signal. The polynomial
code signal is varied in accordance with the content of
data bits shifted from a predetermined register stage
of the feedback shift register.

It would be advantageous to provide for generating
a keystream having enhanced cryptographic and
ease-of-implementation features as compared to those
provided by DFAST and other prior art keystream
generators.

It would be further advantageous to provide for
keystream generation utilizing multiple feedback shift
registers constructed using an extended Galois Field
GF(2^m). Extended Galois Field mathematics is
well-suited to implementation in software systems. It
would be still further advantageous to provide for
keystream generation where the structure of the
feedback shift registers is also varied by a polynomial
code signal. It would be even further advantageous to
provide for keystream generation using several
randomization (e.g., permutation) stages to combine
data bits from predetermined register stages of the
feedback shift registers in a non-linear manner.

The present invention provides the foregoing and
other advantages. More specifically, the present
invention provides for an extension and improvement of
the DFAST technology described in commonly assigned
U.S. patent no. 4,860,353. An improved dynamic feedback
arrangement scrambling technique in accordance with the
invention (sometimes designated herein as "DFAST2")
provides a keystream with enhanced cryptographic and
ease-of-implementation features as compared to DFAST.

The present invention is suitable for use with a cable
television system or hosts with point of deployment
(POD) capability. The present invention is particularly
suited for use with OpenCable™ set top boxes and PODs
developed by Cable Television Laboratories, Inc.
(CableLabs™) of Colorado, USA, and is incorporated into
the January 2000 POD Copy Protection Standard.

SUMMARY OF THE INVENTION

The present invention provides methods and
apparatus for the generation of a cryptographic one way
function that produces an output that appears random
for use in encrypting or decrypting binary data. More
particularly, the present invention provides a
non-linear key generation algorithm using multiple
feedback shift registers. A keystream is typically a
long series of randomized bits used to encrypt or
decrypt a stream of data using the exclusive OR
operation. A keystream generator is the source of these
bits, and the first set of bits output from this
keystream generator can be used as an actual encryption
key of determinate size. The term "key" can thus mean a
fixed set of bits (e.g. 56 bits) output from a
keystream generator. The feedback shift registers may
be constructed utilizing an advanced mathematical
construct called an extended Galois Field GF(2^m). The
key is generated as a non-linear function of the
outputs of the multiple feedback shift registers, which
may be a combination of static feedback shift registers
and dynamic feedback shift registers. Dense primitive
polynomials with many coefficients may be used to
produce a cryptographically robust key for use as an
encryption or decryption key.
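
The exclusive OR use of a keystream described above can be sketched in a few lines of Python. This is only an illustration: the function name `xor_with_keystream` and the placeholder keystream bytes are assumptions made for the example and are not output of the generator described herein.

```python
def xor_with_keystream(data: bytes, keystream: bytes) -> bytes:
    """XOR each data byte with the corresponding keystream byte."""
    if len(keystream) < len(data):
        raise ValueError("keystream must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, keystream))


# Placeholder keystream bytes (hypothetical); a real system would take
# them from the keystream generator.
keystream = bytes([0x3A, 0xC5, 0x7E, 0x01, 0x99])
plaintext = b"hello"

ciphertext = xor_with_keystream(plaintext, keystream)
# Applying the identical keystream a second time recovers the plaintext.
recovered = xor_with_keystream(ciphertext, keystream)
```

Because XOR is its own inverse, the same routine serves for both encryption and decryption, which is why both ends need an identical keystream.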

In an illustrative embodiment of the invention, a
key is generated using multiple feedback shift
registers. Each feedback shift register has input,
intermediate, and output stages through which data bits
are shifted serially in response to a clock signal.
Data bits from predetermined register stages of
multiple feedback shift registers are provided to a
first randomization stage. The output of the first
randomization stage is provided to a second
randomization stage. The output of the second
randomization stage and data bits from other
predetermined register stages of the feedback shift
registers are provided to at least one additional
randomization stage. The data bits are permuted at
each randomization stage and the final randomization
stage output provides the keystream. The last 56 bits
of keystream may be used as the final key.

In a preferred embodiment, a third randomization 
stage may be the final randomization stage. 

The structure of at least one of the feedback
shift registers may be varied in response to a
polynomial code signal generated by a polynomial code
signal generator. In addition, the polynomial code used
to generate the polynomial code signal may be varied.

The feedback shift registers may comprise a
plurality of dynamic feedback shift registers and at
least one static feedback shift register.

In a preferred embodiment, the feedback shift
registers may comprise a first dynamic feedback shift
register, a second dynamic feedback shift register, and
a static feedback shift register. Seed data may be
input into an input buffer. A first portion of the seed
data from the input buffer may be provided to the first
dynamic feedback shift register and a second portion of
the seed data from the input buffer may be provided to
the second dynamic feedback shift register. A third
portion of the seed data from the input buffer may be
provided to the static feedback shift register.

The data bits of the dynamic feedback shift
registers may be shifted serially from each register
stage in response to a clock signal. A number of finite
field adders may be arranged between predetermined
pairs of the register stages of the dynamic feedback
shift registers, such that one of the inputs to each
adder is provided from the preceding register stage and
the other input of each adder is fed back from the
output terminal of the output stage via a finite field
multiplier.

The first randomization stage may comprise a
randomization table (e.g., a permutation table) for
permuting data bits from predetermined register stages.
The second randomization stage may comprise a
non-linear mixing function to combine the output of the
first randomization stage and data bits from other
predetermined register stages of the feedback shift
registers. The third randomization stage may comprise
multiple non-linear S-Boxes. The S-Boxes may be 8*8
S-Boxes. The third randomization stage may comprise 256
non-linear 8*8 S-Boxes.

In a preferred embodiment, the feedback shift
registers may be constructed using an extended Galois
field (GF(2^m)). For example, the extended Galois field
(GF(2^8)) may be used. The polynomials used for the
feedback shift registers may be primitive and
irreducible to assure maximum length output sequences.

In an alternate embodiment, the first
randomization stage may comprise multiple randomization
tables for permuting data bits from predetermined
register stages. For example, eight randomization
tables may be provided in the first randomization
stage. The data bits from the feedback shift registers
may be multiplexed prior to input into the first
randomization stage.

In a further embodiment, the output from the third
randomization stage may be provided to a pre-keystream
register. Alternate bits of the output from said
pre-keystream register may be provided to a select
chain buffer. The output from the select chain buffer
may be provided to a decoding logic unit, which
decoding logic unit decodes a particular polynomial for
use in generating a polynomial code signal, said
polynomial code signal being provided to at least one
of the feedback shift registers. The remaining bits of
the output from said pre-keystream register may be
provided to a keystream register. The output of the
keystream register provides the keystream. The third
randomization stage may comprise multiple non-linear
S-Boxes.

Certain output of the S-Boxes may be provided to a
codestream register. This output may be clocked through
the codestream register and added to data bits shifted
from at least one of the feedback shift registers via a
non-binary adder to produce feedback data bits. The
feedback data bits may be provided to the input stage
and predetermined intermediate stages of at least one
of the feedback shift registers.

Use of a 192 bit shared seed input is suggested,
but other input key lengths can also be used. Use of a
192 bit input will generate a 56 bit key from the
output keystream. The 192 bit input may be derived from
a 128 bit input by duplicating half the bits.

The use of Galois Field GF(2^8) operations is
well-suited for implementation on CPUs that use 8, 16,
or 32 bit instruction widths. Although the invention is
described for implementation in software, it can easily
be adapted for implementation in hardware.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows a block diagram representation of the
present invention;

Figure 2 shows an illustrative embodiment of the
present invention;

Figure 3 shows an alternate illustrative embodiment of
the present invention;

Figure 4 shows an implementation of a static feedback
shift register in accordance with the present
invention;

Figure 5 shows an implementation of a dynamic feedback
shift register in accordance with the present
invention; and

Figure 6 shows a further implementation of a dynamic
feedback shift register in accordance with the present
invention.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides methods and
apparatus for key generation using multiple feedback
shift registers. The feedback shift registers may be
constructed utilizing an advanced mathematical
construct called an extended Galois Field GF(2^m). The
key is generated as a non-linear function of the
outputs of the multiple feedback shift registers, which
may be a combination of static feedback shift registers
and dynamic feedback shift registers. Dense primitive
polynomials with many coefficients may be used to
produce a cryptographically robust keystream for use as
an encryption or decryption key.

The keystream may be pseudorandomly generated from
a shared seed key, with the intent that the keystream
appears random to a computationally bounded adversary.
The shared seed key value may be derived, e.g., from a
host authentication and key exchange process. Use of a
192 bit shared seed input is suggested, but other input
key lengths can also be used. Use of a 192 bit input
will generate a 56 bit keystream. The 192 bit input may
be derived from a 128 bit input by duplicating half the
bits.

In an illustrative embodiment of the invention as
shown in Figure 1, a keystream 100 is generated using
multiple feedback shift registers. Each feedback shift
register has input, intermediate, and output stages
through which data bits are shifted serially in
response to a clock signal. Data bits 10 from
predetermined register stages of multiple feedback
shift registers are provided to a first randomization
stage 20. The output of the first randomization stage
20 is provided to a second randomization stage 30. The
output of the second randomization stage 30 and data
bits 12 from other predetermined register stages of the
feedback shift registers are provided to at least one
additional randomization stage. The data bits are
permuted at each randomization stage and the output
of a final randomization stage 40 provides the
keystream 100.

In Figure 1, three randomization stages 20, 30,
and 40 are shown, with the third randomization stage
being the final randomization stage 40. However, those
skilled in the art will appreciate that any number of
randomization stages may be provided, depending on the
implementation and the cryptographic integrity of the
keystream 100 desired.

The structure of at least one of the feedback
shift registers may be varied in response to a
polynomial code signal 90 generated by a polynomial
code signal generator (e.g., decoding logic unit 80).
In addition, the polynomial code used to generate the
polynomial code signal 90 may be varied.

The feedback shift registers (FSR) may comprise a
plurality of dynamic feedback shift registers (DFSR)
and at least one static feedback shift register (SFSR).
The two DFSRs may be arranged in parallel. The
keystream 100 is generated as a non-linear function of
the outputs of both the SFSR and the DFSRs. The DFSRs
are dynamic shift registers whose structure is varied
by the polynomial code signal 90. The polynomial code
signal 90 may itself be varied in accordance with data
bits shifted from a predetermined register stage. It is
substantially more difficult to duplicate the keystream
100 if the polynomial code varies. The bits that cause
the polynomial code signal 90 to vary are preferably
not included in the keystream 100.

In a preferred embodiment as shown in Figure 2,
the feedback shift registers may comprise a static
feedback shift register A (generally designated 110), a
first dynamic feedback shift register B (generally
designated 120), and a second dynamic feedback shift
register C (generally designated 130). Seed data may be
input into an input buffer 140. The contents of the
input buffer 140 may be loaded in parallel into the
shift registers 110, 120, and 130 to initialize
keystream generation. A first portion of the seed data
from the input buffer 140 may be provided to the first
dynamic feedback shift register B and a second portion
of the seed data from the input buffer may be provided
to the second dynamic feedback shift register C. A
third portion of the seed data from the input buffer
may be provided to the static feedback shift register
A.

In the embodiment shown in Figure 2, the feedback
shift registers 110, 120, and 130 (registers A, B, and
C respectively) may be constructed using an extended
Galois field (GF(2^m)). As is well known in the art, a
Galois field is a mathematical field extension obtained
by considering the coefficients and roots of a given
polynomial. The polynomials used for the feedback shift
registers in the present invention may be primitive and
irreducible. Each such shift register would generate a
sequence with the maximal length

max_length = 2^(m*L) - 1

(where L is the number of shift register stages).
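
As a worked instance of the maximal-length formula, the short Python sketch below evaluates it for m = 8 and L = 8, the values used in the GF(2^8) configuration with eight stages per register described later in this specification. The function name `max_sequence_length` is an assumption made for the example.

```python
def max_sequence_length(m: int, num_stages: int) -> int:
    """Maximal sequence length 2^(m*L) - 1 of an L-stage FSR over GF(2^m)."""
    return 2 ** (m * num_stages) - 1


# Eight stages of 8-bit cells give a 64-bit state, so the register can
# pass through every non-zero state before repeating.
period = max_sequence_length(8, 8)
```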

In the Figures, RGA denotes output from a stage of
register A (110), RGB denotes output from a stage of
register B (120), and RGC denotes output from a stage
of register C (130). For example, Figure 2 shows
register A having output from L register stages (RGA_0,
RGA_1, RGA_2, ..., RGA_{L-1}).

The data bits of the feedback shift registers 110,
120, and 130 may be shifted serially from each register
stage in response to a clock signal. A number of finite
field adders may be arranged between predetermined
pairs of the register stages of the feedback shift
registers 110, 120, and 130, such that one of the
inputs to each adder is provided from the preceding
register stage and the other input of each adder is fed
back from the output terminal of the output stage via a
finite field multiplier. An example of the structure of
each feedback shift register 110, 120, and 130 of
Figure 2 is described in detail below in connection
with Figures 4, 5, and 6. It is noted that the
invention may utilize three or more stages of
non-linear functions to combine data bits shifted from
predetermined register stages of the feedback shift
registers.

The first randomization stage (e.g., first
randomization stage 20 of Figure 1) may comprise one or
more randomization tables (e.g., "F1 tables" 150-157)
for permuting data bits from predetermined register
stages. In the example shown in Figure 2, eight F1
tables 150-157 are used. The eight F1 tables are shown
for purposes of example only, and those skilled in the
art will appreciate that the invention can be
implemented using any number of F1 tables (or similar
non-linear permutation functions) in the first
randomization stage. The F1 tables 150-157 may be fixed
substitution tables used to map data bits shifted from
various feedback shift registers to a different value.
The inputs to each F1 table may be separated by at
least one exclusive-or gate in their source feedback
shift register structure.

The data bits from selected register stages of the
feedback shift registers may be multiplexed (e.g., at
multiplexers 170-173) prior to input into the F1 tables
150-157. Selection by the multiplexers 170-173 between
data bits shifted from different predetermined register
stages is controlled in response to a data bit shifted
from a different predetermined register stage. For
example, in Figure 2, two output bits (RGA_j and RGA_i)
are shifted directly from different stages of register
A to multiplexer 170. One of RGA_j and RGA_i is selected
by multiplexer 170 in response to a data bit shifted
from register C (RGC_k). Similar operations take place
at multiplexers 171, 172, and 173. The output of
multiplexers 170, 171, 172, and 173 is provided to F1
tables 150, 153, 154, and 157 respectively.

F1 table 151 receives input directly from a stage
of register A (RGA_m). F1 table 152 receives input
directly from a stage of register B (RGB_p). F1 table
155 receives input directly from a stage of register A
(RGA_mm). F1 table 156 receives input directly from a
stage of register B (RGB_pp).
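
The multiplexer and table lookup path of the first randomization stage (multiplexer 170 feeding F1 table 150) might be sketched as follows. The table contents are not given in this description, so `F1_TABLE` is a hypothetical stand-in built from a seeded shuffle; the use of the low bit of the register C byte as the select signal is likewise an assumption.

```python
import random

# Hypothetical F1 table: a fixed byte substitution. The real table
# contents are not listed here, so a seeded shuffle stands in for them.
_rng = random.Random(0)
F1_TABLE = list(range(256))
_rng.shuffle(F1_TABLE)


def mux(candidate_0: int, candidate_1: int, control_byte: int) -> int:
    """Select one of two stage bytes (low control bit is an assumption)."""
    return candidate_1 if control_byte & 1 else candidate_0


def first_stage(rga_i: int, rga_j: int, rgc_k: int) -> int:
    """Multiplexer 170 followed by F1 table 150, producing RGA'_{i/j}."""
    return F1_TABLE[mux(rga_i, rga_j, rgc_k)]
```

Tables 151, 152, 155, and 156 would skip the multiplexer and index the table directly with the register stage byte.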

The second randomization stage (e.g., second
randomization stage 30 of Figure 1) may comprise one or
more non-linear mixing functions (e.g., 161, 162) to
combine the output of the first randomization stage
and data bits from other predetermined register stages
of the feedback shift registers. In the example shown
in Figure 2, two non-linear mixing functions 161, 162
are used for purposes of example only. Those skilled in
the art will appreciate that the invention can be
implemented using any number of non-linear mixing
functions (or similar non-linear permutation functions)
in the second randomization stage.

The non-linear logic used by the non-linear
mixing functions 161, 162 combines at least 4 tap
positions, some of which are from the F1 table outputs,
and several others directly from predetermined FSR
stages. For example, Figure 2 shows non-linear mixing
function 161 as having input RGB'_{i'/j'} from F1 table
153, input RGB'_p from F1 table 152, input RGA'_m from
F1 table 151, and input RGA'_{i/j} from F1 table 150, as
well as inputs RGA_{q'} and RGB_q directly from a stage
of register A and a stage of register B respectively.
In such an instance, the non-linear mix function may be

g0 = RGA'_{i/j} ⊗ RGB'_{i'/j'} ⊕ RGA_{q'} ⊗ RGB'_p ⊕ RGA'_m ⊗ RGB_q

In this equation, ⊗ represents the field element
multiplication operation, and ⊕ represents the field
element addition operation (bit-wise XOR).

Non-linear mix function 162 is shown as having
input RGA'_{ii/jj} from F1 table 154, input RGA'_mm from
F1 table 155, input RGB'_pp from F1 table 156, and input
RGB'_{ii'/jj'} from F1 table 157, as well as inputs
RGA_{n'} and RGB_n directly from a stage of register A
and a stage of register B respectively. These inputs
may be combined at the non-linear mix function 162 by
the function

g1 = RGA'_{ii/jj} ⊗ RGB'_{ii'/jj'} ⊕ RGA_{n'} ⊗ RGB'_pp ⊕ RGA'_mm ⊗ RGB_n
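
The sum-of-products form shared by the two mix functions can be sketched in Python as below. This is a minimal illustration that assumes byte-wide elements of GF(2^8) reduced by X^8 + X^4 + X^3 + X^2 + 1, the field generator given later in this description; the names `gf_mul` and `mix` and their parameter names are hypothetical.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two GF(2^8) elements modulo X^8 + X^4 + X^3 + X^2 + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # field addition is bit-wise XOR
        a <<= 1
        if a & 0x100:
            a ^= poly            # reduce back to 8 bits
        b >>= 1
    return result


def mix(f1_a_mux: int, f1_b_mux: int, direct_a: int,
        f1_b: int, f1_a: int, direct_b: int) -> int:
    """Three field products combined by field addition (XOR)."""
    return (gf_mul(f1_a_mux, f1_b_mux)
            ^ gf_mul(direct_a, f1_b)
            ^ gf_mul(f1_a, direct_b))
```

For g0 the arguments would be, in order, the outputs of F1 tables 150 and 153, the direct register A byte, the outputs of F1 tables 152 and 151, and the direct register B byte; g1 uses the corresponding inputs of mix function 162.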

The third randomization stage (e.g., Final
Randomization Stage 40 of Figure 1) may comprise
multiple non-linear substitution boxes (S-Boxes) 165.
The S-Boxes 165 may be 8*8 S-Boxes.

In a preferred embodiment, the third randomization
stage may comprise 256 non-linear dynamic 8*8 S-Boxes
165 (e.g., S_0, S_1, S_2, ..., S_255). A loop counter is
provided to count n (number of FSR cycles) to ensure
that every element in the dynamic S-Box 165 changes
during operation. At least one predetermined stage of
two of the extended Galois field shift register
structures directly provides a randomized byte of S-Box
address signals (shown as RGB_x and RGA_y input into
S-Box RAM 166). The already highly randomized outputs
(g0 and g1) from the non-linear mixing functions 161,
162 are added to the nth element of the S-Box 165, and
to the element of S-Box 165 indexed by RGB_x/RGA_y
directly from the FSR structures. This generates highly
randomized outputs Y0' and Y1'. The S-Box elements
indexed by n and Y0'/Y1' are then swapped so that every
element in (S_0, S_1, ..., S_255) is affected during the
first 256 cycles of FSR shifting operations.
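
One possible reading of this dynamic S-Box update is sketched below in Python. Several details are assumptions made only for illustration: that g0 is paired with RGB_x and g1 with RGA_y, that "added" means field addition (XOR) of the three bytes, and that the element at index n is swapped first with the element at Y0' and then with the element at Y1'. The function name `sbox_step` and the arithmetic used to generate stand-in input bytes are hypothetical.

```python
def sbox_step(sbox, n, g0, g1, rgb_x, rga_y):
    """Advance the dynamic S-Box by one FSR cycle and return (Y0', Y1')."""
    idx = n % 256
    # Field addition in GF(2^8) is bit-wise XOR.
    y0 = g0 ^ sbox[idx] ^ sbox[rgb_x]
    y1 = g1 ^ sbox[idx] ^ sbox[rga_y]
    # Swap the element at the loop-counter index with the elements
    # selected by Y0' and Y1' (the swap order is an assumption).
    sbox[idx], sbox[y0] = sbox[y0], sbox[idx]
    sbox[idx], sbox[y1] = sbox[y1], sbox[idx]
    return y0, y1


# Identity initialization, then 256 cycles with stand-in input bytes.
sbox = list(range(256))
for n in range(256):
    sbox_step(sbox, n, (7 * n + 3) & 0xFF, (11 * n + 5) & 0xFF,
              (13 * n) & 0xFF, (17 * n + 1) & 0xFF)
```

Because the update only swaps entries, the table remains a permutation of 0 to 255 however many cycles are run, and the loop counter guarantees each index is visited within the first 256 cycles.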

In a further embodiment, the output (pre-keystream
45) from the third randomization stage (e.g., S-Box
165) may be provided to a pre-keystream register 50.
Alternate bits of the output from said pre-keystream
register 50 may be provided to a select chain buffer
70. The output from the select chain buffer 70 may be
provided to a decoding logic unit 80, which decoding
logic unit 80 decodes a particular polynomial for use
in generating a polynomial code signal 90, said
polynomial code signal 90 being provided to at least
one of the dynamic feedback shift registers 120, 130.
The remaining bits of the output from said
pre-keystream register 50 may be provided to a
keystream register 60. The output of the keystream
register 60 provides the keystream 100.

The pre-keystream 45 may be clocked into the
pre-keystream register 50 at the system clock rate.
Alternate bytes of the pre-keystream 45 may be clocked
at one-half the system clock rate into the select chain
buffer 70 by an inverted CLOCK/2 signal. The remaining
bytes of the pre-keystream 45 may be clocked at
one-half the system clock rate into the keystream
register 60 by an uninverted CLOCK/2 signal. This
avoids placing contiguous bytes of the pre-keystream 45
into the keystream 100. This also assures that
keystream bytes are not used as signals to control
selection of the polynomial code signals 90. The
pre-keystream register 50 is the final output creation
stage and produces 8 bits of output data per clock
cycle of uninverted CLOCK/2.
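
The alternate-byte routing described above can be pictured with the short Python sketch below. Which parity of byte goes to which destination depends on the phase of the CLOCK/2 signals and is an assumption here, as is the function name `split_pre_keystream`.

```python
def split_pre_keystream(pre_keystream: bytes):
    """Route alternate pre-keystream bytes to two separate destinations."""
    # Assumption: even-indexed bytes feed the select chain buffer and
    # odd-indexed bytes feed the keystream register.
    select_chain = pre_keystream[0::2]
    keystream = pre_keystream[1::2]
    return select_chain, keystream


select_chain, keystream = split_pre_keystream(bytes(range(6)))
```

No byte appears in both outputs, so bytes that steer the polynomial selection never appear in the keystream, and adjacent pre-keystream bytes never appear side by side in it.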

The decoding logic unit 80 uses the bits stored in
the select chain buffer 70 to decode a particular
polynomial, setting a signal corresponding to a
particular polynomial true while setting all other
polynomial signals false.

The decoding logic unit 80 may use a fixed byte
substitution table (F2 table 81) to provide up to 2^m
different polynomial code signals 90. The F2 table 81
may be a fixed substitution table similar to the F1
tables 150-157.

Certain output of the S-Boxes 165 may be provided
to a codestream register 175. This output may be
clocked through the codestream register 175 (codestream
48) and added to data bits shifted from at least one of
the feedback shift registers 120, 130 via a non-binary
adder to produce feedback data bits. The feedback data
bits may be provided to the input stage and
predetermined intermediate stages of at least one of
the feedback shift registers 120, 130. The codestream
48 is distinct from the pre-keystream 45 and only
affects the dynamic FSR structures (120 and 130) and
not the static FSR structure (110).

Figure 3 illustrates a specific configuration of
the Figure 2 embodiment of the invention where the
polynomials used for the FSRs are constructed over
extended Galois Field GF(2^8), and all polynomials are
selected to be primitive and irreducible in order to
generate a maximum length sequence. The degree of these
polynomials is 8 so there will be 8 stages of 8-tuple
shift register cells used in each FSR. Since the
sparseness of the terms for the polynomial can be a
source of cryptographic weakness, dense primitive
polynomials with many coefficients are used. The output
from each stage of each FSR may have 8 bits (one byte).

The three FSRs 110, 120, and 130 require 192 total
bits (24 bytes) of information to initialize them.
These 192 bits are drawn from 128 bits of input, with
half the 128 bits duplicated to provide 192 bits of
input to input buffer 140. Input buffer 140 provides
64 bits of input to each FSR 110, 120, and 130.

As shown in the table below, a 128 bit (16 byte)
input is used to initialize 192 bits (24 bytes) of FSR
state. The input bytes are numbered from 0 to 15, and
are used to initialize the three FSRs A, B, and C.

Table 1: Mapping Input to FSRs A, B, and C

A 128 bit input data seed is stored in the input
buffer 140. All of the bits of the input seed are
loaded in parallel from the input buffer into dynamic
FSR structures B and C. The remaining third of the
192 bit FSR state, static FSR structure A, is loaded in
parallel with a duplicate of half of the input
(bytes 0-3 and 8-11).
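
The expansion of the 16-byte seed into three 8-byte register loads might be sketched as follows. The bytes copied into register A (0-3 and 8-11) are taken from the text; the split of the seed between registers B and C (bytes 0-7 and 8-15) is an assumption, since the body of Table 1 is not reproduced here. The function name `expand_seed` is hypothetical.

```python
def expand_seed(seed: bytes):
    """Derive 8-byte initial states for FSRs A, B, and C from a 16-byte seed."""
    if len(seed) != 16:
        raise ValueError("seed must be exactly 16 bytes (128 bits)")
    reg_a = seed[0:4] + seed[8:12]   # duplicated half of the input
    reg_b = seed[0:8]                # assumed: first half of the seed
    reg_c = seed[8:16]               # assumed: second half of the seed
    return reg_a, reg_b, reg_c


reg_a, reg_b, reg_c = expand_seed(bytes(range(16)))
```

The three loads total 24 bytes (192 bits), of which 8 bytes are duplicates of seed bytes already present in B or C.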

As each FSR constructed in the extended Galois
Field GF(2^8) has 8 stages, input data for each FSR
should be 64 bits in size (since the actual input size
is 128 bits, the above processing step is needed so
that the 192 bits of FSR state can be initialized from
128 bits of input).

As discussed above in connection with Figure 2,
each pre-keystream byte 45 is generated by a
three-stage non-linear function process. In the first
randomization stage, certain predetermined data bytes
shifted from two separate feedback shift registers (A
and B) are permuted by the F1 tables 150-157. In the
second randomization stage, non-linear functions 161,
162 are used to mix data bytes from different F1 tables
as well as data bytes directly from predetermined
stages of registers A and B. In a third randomization
stage, an 8*8 S-Box 165 (initialized to S_0 = 0,
S_1 = 1, ..., S_255 = 255) is used to permute the
outputs from the second-level non-linear function,
along with 2 bytes of data from registers A and B.

Static FSR structure 110 (register A) and two
dynamic FSR structures 120 and 130 (registers B and C)
are shown in Figure 4, Figure 5, and Figure 6
respectively. Each of these structures includes L
stages, where L = 8, with stage 0 being the input
stage, stage 7 being the output stage, and stages 1
through 6 being the intermediate stages. All the
polynomials used in the explanations herein are
selected for illustration purposes. It should be
appreciated that various other polynomials may be used
to implement the invention.

All the FSRs are constructed over the extended
Galois Field, GF(2^8). The elements in GF(2^8) are
generated by:

F(X) = X^8 + X^4 + X^3 + X^2 + 1 = [1 0001 1101]

where X is a primitive element of GF(2^8). Addition and
multiplication operations within GF(2^8) are also
illustrated in Figure 4 at 117.
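
A minimal Python sketch of arithmetic in this field is given below for illustration. Elements are held as bytes, F(X) is written as the integer 0x11D, and the primitive element X is the byte 0x02; the function names `gf_add`, `gf_mul`, and `alpha_pow` are assumptions made for the example.

```python
FIELD_POLY = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1


def gf_add(a: int, b: int) -> int:
    """Field addition in GF(2^8) is bit-wise XOR."""
    return a ^ b


def gf_mul(a: int, b: int) -> int:
    """Shift-and-add multiplication, reduced modulo F(X)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY
        b >>= 1
    return result


def alpha_pow(n: int) -> int:
    """Return X^n, where X (byte 0x02) is the primitive element."""
    value = 1
    for _ in range(n % 255):
        value = gf_mul(value, 0x02)
    return value
```

For example, `alpha_pow(8)` reduces X^8 to X^4 + X^3 + X^2 + 1, the byte 0x1D, and the powers X^0 through X^254 run through all 255 non-zero field elements.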

In the static feedback register 110 (structure A)
shown in Figure 4, data bits are shifted serially from
each stage in response to a clock signal applied to the
CLK terminal. Data bytes shifted from the register
stages are provided respectively at the output
terminals RGA_0 through RGA_7. A number of finite field
adders 111-116 are respectively located between
predetermined pairs of register stages. In Figure 4,
finite field adders are located between Stage 0 and
Stage 1 (111), Stage 1 and Stage 2 (112), Stage 3 and
Stage 4 (113), Stage 4 and Stage 5 (114), Stage 5 and
Stage 6 (115), and Stage 6 and Stage 7 (116). No finite
field adder is located between Stages 2 and 3. One of
the inputs to each adder 111-116 is provided from the
preceding register stage and the other input to each
adder is fed back from output terminal RGA_7 of the
eighth (output) stage via finite field multipliers
91-97 respectively.

The function of a non-binary adder (XOR) and a
non-binary multiplier is shown generally at 117.

The structure of register A has a fixed, non-dynamic
polynomial code and coefficients. The polynomial code
of the static FSR structure of register A is not varied
in response to the polynomial code signal. As shown in
Figure 4, the polynomial used for register A is:

Poly_A = x^8 ⊕ α^212 ⊗ x^7 ⊕ α^147 ⊗ x^6 ⊕ α^194 ⊗ x^5
         ⊕ α^47 ⊗ x^4 ⊕ 0 ⊗ x^3 ⊕ α^38 ⊗ x^2 ⊕ α^229 ⊗ x
         ⊕ α^74,

where α(X) = α = X, and each coefficient is written as
the 8-tuple j7 X^7 + j6 X^6 + j5 X^5 + j4 X^4 + j3 X^3
+ j2 X^2 + j1 X + j0. The values for the individual
coefficient multipliers are as follows:

α^212 = α^6 + α^5 + α^4 + α^3 + 1 = [0 1 1 1 1 0 0 1]
α^147 = α^5 + α^3 + 1 = [0 0 1 0 1 0 0 1]
α^194 = α^5 + α^4 + α = [0 0 1 1 0 0 1 0]
α^47 = α^5 + α + 1 = [0 0 1 0 0 0 1 1]
α^38 = α^7 + α^4 + α^2 = [1 0 0 1 0 1 0 0]
α^229 = α^6 + α^5 + α^4 + α^3 + α = [0 1 1 1 1 0 1 0]
α^74 = α^7 + α^3 + 1 = [1 0 0 0 1 0 0 1]
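One clock of the static register can be sketched as follows. This is an illustration under stated assumptions rather than a transcription of Figure 4: the names `fsr_step`, `POLY_A_EXPONENTS`, and `POLY_A` are hypothetical, α is encoded as 0x02, and the coefficient of x^k is assumed to drive the adder feeding Stage k (with the constant term feeding Stage 0). That mapping is consistent with the zero coefficient of x^3 and the absence of an adder between Stages 2 and 3, but it is not stated explicitly in the text.

```python
FIELD_POLY = 0x11D  # F(X) = X^8 + X^4 + X^3 + X^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements, reducing modulo F(X)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY
        b >>= 1
    return result


def alpha_pow(e: int) -> int:
    """Return alpha**e with alpha assumed to be encoded as 0x02."""
    value = 1
    for _ in range(e % 255):
        value = gf_mul(value, 0x02)
    return value


# Exponents of alpha for the coefficients of x^0 .. x^7 in Poly_A.
# None marks the zero coefficient of x^3.
POLY_A_EXPONENTS = [74, 229, 38, None, 47, 194, 147, 212]
POLY_A = [0 if e is None else alpha_pow(e) for e in POLY_A_EXPONENTS]


def fsr_step(stages, coeffs):
    """Advance an 8-stage register one clock (assumed Galois-style layout).

    stages[0] is Stage 0 and stages[7] is the output stage (RGA7).
    """
    feedback = stages[7]
    nxt = [gf_mul(feedback, coeffs[0])]            # Stage 0 input
    for k in range(1, 8):
        # adder between Stage k-1 and Stage k; a zero coefficient
        # contributes nothing, matching the missing adder at x^3
        nxt.append(stages[k - 1] ^ gf_mul(feedback, coeffs[k]))
    return nxt
```

Starting from a state whose only non-zero byte is a 1 in the output stage, a single call to `fsr_step` loads each stage with its own multiplier value, which is a convenient way to check the wiring.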



In the dynamic feedback shift register structures
B and C (120 and 130), as shown in Figures 5 and 6
respectively, data bits are shifted serially in
response to a clock signal, and they are also shifted
in accordance with a polynomial code signal 90
connected from the output stage into the input stage.
As discussed in connection with Figure 4, a number of
logic elements are respectively located between
predetermined pairs of register stages of registers B
and C. The logic elements process a data bit shifted
from the preceding stage with the data bit fed back
from the output stage, also in accordance with a
polynomial code signal 90.

In the B-register and C-register structures, a
polynomial code signal from the decoding logic unit is
applied to select one set of polynomial coefficients
(i.e. a different set of extended Galois Field
multiplier values) from two sets of polynomial
coefficients.

For example, in the dynamic feedback register
structure B shown in Figure 5, data bits are shifted
serially from each stage in response to a clock signal
applied to the CLK terminal. Data bytes shifted from
the register stages are provided respectively at the
output terminals RGB0 through RGB7. A number of finite
field adders 121-127 are respectively located between
each pair of register stages. One of the inputs to each
adder 121-127 is provided from the preceding register
stage and the other input to each adder is fed back
from output terminal RGB7 of the eighth output stage
via one of two sets of finite field multipliers (e.g.,
one set includes finite field multipliers 200, 202,
204, 206, 208, 210, 212, and 214; the second set
includes 201, 203, 205, 207, 209, 211, 213, and 215).
Which set of finite field multipliers is used depends
on the polynomial selected in response to the
polynomial code 90 (from the decoding logic unit 80 of
Figure 3).

The structure of dynamic feedback shift register B
shown in Figure 5 uses only two polynomials. The least
significant bit (lsb) of the Decoding Logic Unit 80
output is used to select one of the following
polynomials for the dynamic FSR structure B:

Poly_B0 = x^8 ⊕ α^92 ⊗ x^7 ⊕ α^229 ⊗ x^6 ⊕ α^5 ⊗ x^5
          ⊕ α^95 ⊗ x^4 ⊕ α^84 ⊗ x^3 ⊕ 0 ⊗ x^2 ⊕ α^195 ⊗ x
          ⊕ α^176,

where α(X) = α = X, and each coefficient is written as
the 8-tuple j7 X^7 + j6 X^6 + j5 X^5 + j4 X^4 + j3 X^3
+ j2 X^2 + j1 X + j0. The values for the individual
coefficient multipliers are as follows:

α^92 = α^6 + α^4 + α^3 + α + 1 = [0 1 0 1 1 0 1 1]
α^229 = α^6 + α^5 + α^4 + α^3 + α = [0 1 1 1 1 0 1 0]
α^5 = α^5 = [0 0 1 0 0 0 0 0]
α^95 = α^7 + α^6 + α^5 + α = [1 1 1 0 0 0 1 0]
α^84 = α^6 + α^5 + α^3 + α + 1 = [0 1 1 0 1 0 1 1]
α^195 = α^6 + α^5 + α^2 = [0 1 1 0 0 1 0 0]
α^176 = α^7 + α^6 + α^5 + α + 1 = [1 1 1 0 0 0 1 1]

The second polynomial used in the structure B is:

Poly_B1 = x^8 ⊕ α^22 ⊗ x^7 ⊕ α^47 ⊗ x^6 ⊕ 0 ⊗ x^5
          ⊕ α^230 ⊗ x^4 ⊕ α^94 ⊗ x^3 ⊕ α^202 ⊗ x^2 ⊕ α^28 ⊗ x
          ⊕ α^188.

The values for the individual coefficient multipliers
are as follows:

α^22 = α^7 + α^6 + α^5 + α^3 + α = [1 1 1 0 1 0 1 0]
α^47 = α^5 + α + 1 = [0 0 1 0 0 0 1 1]
α^230 = α^7 + α^6 + α^5 + α^4 + α^2 = [1 1 1 1 0 1 0 0]
α^94 = α^6 + α^5 + α^4 + 1 = [0 1 1 1 0 0 0 1]
α^202 = α^6 + α^5 + α^4 = [0 1 1 1 0 0 0 0]
α^28 = α^4 + α^3 = [0 0 0 1 1 0 0 0]
α^188 = α^7 + α^5 + α^2 + 1 = [1 0 1 0 0 1 0 1]
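The switch between Poly_B0 and Poly_B1 can be illustrated with a short sketch. Several points are assumptions made for illustration only: the names `dynamic_step`, `POLY_B0`, and `POLY_B1` are hypothetical, an lsb of 0 is taken to select Poly_B0 and an lsb of 1 to select Poly_B1, the coefficient of x^k is assumed to feed Stage k, and the zero coefficient of Poly_B0 is taken to sit at x^2. The codestream byte that is added into the feedback path of registers B and C is not modeled.

```python
FIELD_POLY = 0x11D  # F(X) = X^8 + X^4 + X^3 + X^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements, reducing modulo F(X)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY
        b >>= 1
    return result


# Coefficients of x^0 .. x^7 as bytes, read from the 8-tuples listed above.
POLY_B0 = [0xE3, 0x64, 0x00, 0x6B, 0xE2, 0x20, 0x7A, 0x5B]
POLY_B1 = [0xA5, 0x18, 0x70, 0x71, 0xF4, 0x00, 0x23, 0xEA]


def dynamic_step(stages, f2_output: int):
    """Advance register B one clock using the polynomial picked by the lsb."""
    coeffs = POLY_B1 if f2_output & 1 else POLY_B0   # assumed polarity
    feedback = stages[7]                             # output stage RGB7
    nxt = [gf_mul(feedback, coeffs[0])]
    for k in range(1, 8):
        nxt.append(stages[k - 1] ^ gf_mul(feedback, coeffs[k]))
    return nxt
```

Only the least significant bit of `f2_output` matters here, so 0x10 and 0x11 select different polynomials even though they differ in a single bit.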

The structure of dynamic feedback shift register C
(130) is shown in Figure 6. The function of register C
corresponds to that of register B shown in Figure 5,
with finite field adders 221-227 of register C
equivalent to the finite field adders 121-127 of
register B of Figure 5 and finite field multipliers
230-245 of register C equivalent to finite field
multipliers 200-215 of register B of Figure 5.

In this example implementation, there are also
only two polynomials used in the structure of register
C. The least significant bit (lsb) of the Decoding
Logic Unit 80 output is used to select one out of two
polynomials for the dynamic FSR structure C:

Poly_C0 = x^8 ⊕ α^221 ⊗ x^7 ⊕ α^18 ⊗ x^6 ⊕ α^129 ⊗ x^5
          ⊕ α^200 ⊗ x^4 ⊕ 0 ⊗ x^3 ⊕ α^124 ⊗ x^2 ⊕ α^25 ⊗ x
          ⊕ α^31,

where α(X) = α = X, and each coefficient is written as
the 8-tuple j7 X^7 + j6 X^6 + j5 X^5 + j4 X^4 + j3 X^3
+ j2 X^2 + j1 X + j0. The values for the individual
coefficient multipliers are as follows:

α^221 = α^6 + α^2 + 1 = [0 1 0 0 0 1 0 1]
α^18 = α^5 + α^3 + α^2 + 1 = [0 0 1 0 1 1 0 1]
α^129 = α^4 + α^2 + α + 1 = [0 0 0 1 0 1 1 1]
α^200 = α^4 + α^3 + α^2 = [0 0 0 1 1 1 0 0]
α^124 = α^7 + α^4 + α^2 + α + 1 = [1 0 0 1 0 1 1 1]
α^25 = α + 1 = [0 0 0 0 0 0 1 1]
α^31 = α^7 + α^6 = [1 1 0 0 0 0 0 0]

The second polynomial used in the structure of register
C is:

Poly_C1 = x^8 ⊕ α^39 ⊗ x^7 ⊕ α^203 ⊗ x^6 ⊕ 0 ⊗ x^5
          ⊕ α^185 ⊗ x^4 ⊕ α^151 ⊗ x^3 ⊕ α^114 ⊗ x^2 ⊕ α^7 ⊗ x
          ⊕ α^47.

The values for the individual coefficient multipliers
are as follows:

α^39 = α^5 + α^4 + α^2 + 1 = [0 0 1 1 0 1 0 1]
α^203 = α^7 + α^6 + α^5 = [1 1 1 0 0 0 0 0]
α^185 = α^5 + α^4 + α^2 + α + 1 = [0 0 1 1 0 1 1 1]
α^151 = α^7 + α^5 + α^3 + α = [1 0 1 0 1 0 1 0]
α^114 = α^5 + α^4 + α^3 + α^2 + α = [0 0 1 1 1 1 1 0]
α^7 = α^7 = [1 0 0 0 0 0 0 0]
α^47 = α^5 + α + 1 = [0 0 1 0 0 0 1 1]

Referring back to Figure 3, the polynomial code
signal 90 that is applied in the dynamic FSRs 120 and
130 varies in accordance with data bits shifted from a
predetermined register stage. An F2 table 81 is used in
the Decoding Logic Unit (DLU) 80 to generate the
polynomial code signal.

The decoding logic unit 80 of Figure 3 uses the
least significant bit (lsb) of the F2 table output to
select one of two polynomials from the dynamic FSRs
120, 130. In this case, two different polynomials are
used in the B-register structure and C-register
structure, and one polynomial is used in the A-register
structure. Therefore, only one bit in every other
pre-keystream byte after F2 table substitution is used
directly to select between the two polynomials,
POLY_A0 (POLY_B0) and POLY_A1 (POLY_B1).

Alternate bytes (8 bits) from the pre-keystream 45
are processed to vary the polynomial code signal 90,
and the remaining bytes of the pre-keystream 45 are
processed to provide the keystream 100. The bytes that
cause the polynomial code signal 90 to be varied are
not included in the keystream 100.
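The division of the pre-keystream into control bytes and keystream bytes might be sketched as follows. The contents of F2 table 81 are not given in this section, so `F2_TABLE` below is a hypothetical identity placeholder, and the choice that even-indexed bytes drive the polynomial code while odd-indexed bytes become keystream is an assumption; the text only states that the bytes alternate. The function name `split_pre_keystream` is likewise illustrative.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-in for F2 table 81 (identity mapping).
F2_TABLE = list(range(256))


def split_pre_keystream(pre_keystream: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Return (polynomial code bits, keystream bytes).

    Assumed ordering: bytes at even positions are consumed by the
    decoding logic, bytes at odd positions are emitted as keystream.
    """
    code_bits: List[int] = []
    keystream: List[int] = []
    for index, byte in enumerate(pre_keystream):
        if index % 2 == 0:
            # only the lsb of the F2 table output selects the polynomial
            code_bits.append(F2_TABLE[byte] & 1)
        else:
            keystream.append(byte)
    return code_bits, keystream
```

With this split, the bytes that steer the polynomial selection never appear in the returned keystream, matching the last sentence of the paragraph above.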

As discussed above in connection with Figure 2, in
the embodiment described in Figure 3, the multiplexers
170-173 use the output from predetermined register
stages of register C to select between data bytes
shifted from different predetermined register stages of
registers A and B to provide some of the inputs to the
F1 tables (e.g., F1 tables 150, 153, 154, and 157).
Each multiplexer 170-173 is a 2:1 multiplexer. For
example, two output bits (RGA0 and RGA7) are shifted
directly from different stages of register A to
multiplexer 170. One of RGA0 and RGA7 is selected by
multiplexer 170 in response to a data bit shifted from
register C (RGC6). Similar operations take place at
multiplexers 171, 172, and 173. The outputs of
multiplexers 170, 171, 172, and 173 are provided to F1
tables 150, 153, 154, and 157 respectively.

Detailed multiplexer settings for multiplexers
(MUX) 170 and 172 are shown in the table below.

Table 2: MUX 170 and 172 Settings

MUX       SEL    A      B      Description
MUX 170   RGC6   RGA0   RGA7   Output_i = RGA0_i if RGC6_i = 0,
                               or RGA7_i if RGC6_i = 1,
                               where i = 0, 1, ..., 7
MUX 172   RGC6   RGA1   RGA6   Output_i = RGA1_i if RGC7_i = 0,
                               or RGA6_i if RGC7_i = 1,
                               where i = 0, 1, ..., 7



Detailed multiplexer settings for multiplexers 
(MUX) 171 and 173 are shown in the table below. 



Table 3: MUX 171 and 173 Settings

MUX       SEL    A      B      Description
MUX 171   RGC0   RGB1   RGB6   Output_i = RGB1_i if RGC6_i = 0,
                               or RGB6_i if RGC6_i = 1,
                               where i = 0, 1, ..., 7
MUX 173   RGC0   RGB0   RGB7   Output_i = RGB0_i if RGC7_i = 0,
                               or RGB7_i if RGC7_i = 1,
                               where i = 0, 1, ..., 7



Multiplexers may also be used to provide the
address signal to the S-Box substitution function 165.
For example, the inverse of MUX 170 (MUX 170^-1) is used
to select alternate bits from RGA0 and RGA7. MUX 170^-1
is defined in the following table:

Table 4: MUX 170^-1 Settings

MUX          SEL    A      B      Description
MUX 170^-1   RGC6   RGA7   RGA0   Output_i = RGA7_i if RGC6_i = 0,
                                  or RGA0_i if RGC6_i = 1,
                                  where i = 0, 1, ..., 7



The inverse of MUX 173 (MUX 173^-1) is used to
select alternate bits from RGB0 and RGB7. MUX 173^-1 is
defined in the following table:

Table 5: MUX 173^-1 Settings

MUX          SEL    A      B      Description
MUX 173^-1   RGC0   RGB7   RGB0   Output_i = RGB7_i if RGC6_i = 0,
                                  or RGB0_i if RGC6_i = 1,
                                  where i = 0, 1, ..., 7
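The bit-by-bit selection rule of Tables 2 through 5 can be written compactly with byte-wide masks. The sketch below follows the tables literally, choosing each output bit i from input A or input B according to bit i of the select byte; the function names `mux` and `mux_inverse` are hypothetical, and treating the select signal as a full byte is an assumption drawn from the "i = 0, 1, ..., 7" wording.

```python
def mux(sel: int, a: int, b: int) -> int:
    """Bitwise 2:1 multiplexer on bytes.

    Output bit i is taken from a where bit i of sel is 0,
    and from b where bit i of sel is 1.
    """
    return (a & ~sel & 0xFF) | (b & sel & 0xFF)


def mux_inverse(sel: int, a: int, b: int) -> int:
    """Inverse multiplexer: the A and B inputs swap roles, so it
    returns the bits that mux() did not select."""
    return mux(sel, b, a)
```

For example, `mux(0x0F, 0xAA, 0x55)` keeps the upper nibble of the first input and the lower nibble of the second, while `mux_inverse` with the same arguments returns the complementary selection.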



Two of eight register outputs from each shift
register are tapped as direct inputs to the F1 tables
151, 152, 155, 156. No tap is used more than once.
Inputs to the remaining F1 tables 150, 153, 154, and 157
are from the multiplexers 170, 171, 172, and 173
respectively.

The non-linear mix functions 161, 162 are used to
combine two bytes shifted directly from register A
or register B (RGA2/RGA4, RGB3/RGB5) with others that
are from the F1 tables:

g0 = RGA'0|7 ⊗ RGB'1|6 ⊕ RGA2 ⊗ RGB'4 ⊕ RGA'5 ⊗ RGB3

g1 = RGA'1|6 ⊗ RGB'0|7 + RGA4 ⊗ RGB'2 + RGA'3 ⊗ RGB5

RGA'0|7 is a field element constructed by
selecting 8 bits from RGA0 and RGA7 based on RGC6, and
then passing the selected bits to the F1 table 150. By
the same definition, RGB'1|6 is a field element
constructed by selecting 8 bits from RGB1 and RGB6
based on RGC0, and then passing the selected bits to
the F1 table 153. RGA'1|6 is a field element
constructed by selecting 8 bits from RGA1 and RGA6
based on RGC6, and then passing the selected bits to
the F1 table 154. RGB'0|7 is a field element constructed
by selecting 8 bits from RGB0 and RGB7 based on RGC0,
and then passing the selected bits to the F1 table 157.

The two non-linear functions g0 and g1 set forth
above are constructed in accordance with the following
rules:

• Terms multiplied together shall be from different
register structures.

• At least one of two multiplication terms shall be
from the F1 tables.
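The two mix functions can be sketched directly from their definitions. In this illustration the primed inputs (outputs of the F1 tables) are simply passed in as bytes, since the F1 table contents are not given here; the parameter names are hypothetical, and the "+" in g1 is assumed to denote the same field addition (XOR) as "⊕" in g0.

```python
FIELD_POLY = 0x11D  # F(X) = X^8 + X^4 + X^3 + X^2 + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements, reducing modulo F(X)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= FIELD_POLY
        b >>= 1
    return result


def g0(rga_p07, rgb_p16, rga2, rgb_p4, rga_p5, rgb3):
    """g0 = RGA'0|7 * RGB'1|6 + RGA2 * RGB'4 + RGA'5 * RGB3 over GF(2^8)."""
    return gf_mul(rga_p07, rgb_p16) ^ gf_mul(rga2, rgb_p4) ^ gf_mul(rga_p5, rgb3)


def g1(rga_p16, rgb_p07, rga4, rgb_p2, rga_p3, rgb5):
    """g1 = RGA'1|6 * RGB'0|7 + RGA4 * RGB'2 + RGA'3 * RGB5 over GF(2^8)."""
    return gf_mul(rga_p16, rgb_p07) ^ gf_mul(rga4, rgb_p2) ^ gf_mul(rga_p3, rgb5)
```

Each product pairs one register A byte with one register B byte, and at least one factor of every product is a primed (F1 table) value, which mirrors the two construction rules.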

Referring to Figure 3, the 8x8 S-Box 165 is used
to provide individual pre-keystream bytes 45 and
codestream bytes 48 respectively from output terminals
Y1 and Y0 in accordance with the contents of the output
from the non-linear mix functions 161, 162 and two
bytes of data from the A and B register structures. The
entries of the S-Box are a permutation of the numbers 0
through 255, and the address of the S-Box is a function
of loop counter n, g0, g1, RGB0|7, and RGA0|7.

The S-box is initialized linearly to S[0] = 0, S[1] =
1, ..., S[255] = 255. The S-box permutation process can be
formulated in accordance with the following
relationships:

    for n from 0 to runup_Cycles:
        n = (n % 256)
        Y0' = (g0 + S[RGB7|0] + S[n]) mod 256
        swap S[i] and S[Y0']
        Y1' = (g1 + S[RGA7|0] + S[n]) mod 256
        swap S[i] and S[Y1']
        t0 = (S[g1] + S[Y0']) mod 256
        t1 = (S[g0] + S[Y1']) mod 256
        Y0 = S[t0] and Y1 = S[t1]

The S-Box slowly evolves with the iterations of n to
ensure that every element changes, and also with the
use of (g0, g1, RGB0|7, RGA0|7, Y0', Y1') to ensure that
the elements change randomly.

The Codestream Register 175 receives codestream
bits 48 from the Y1 output terminal of the S-Box 165 in
response to the address signal from the A and B
structures, and also from the non-linear mix functions
161, 162. The codestream bytes 48 are clocked through
the Codestream Register 175, and added by a non-binary
adder (XOR) to the data bytes (RGB7, RGC7) shifted from
the MSB output stage (stage 7) of register B and
register C. This process provides a feedback data
byte, which is then fed back both to the input stage
(stage 0) and to predetermined intermediate stages of
registers B and C in accordance with the applied
polynomial code 90.
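A single iteration of the S-Box relationships given earlier can be sketched as below. This is an illustration under assumptions, not a definitive implementation: the swap index written as i in the relationships is taken here to be the loop counter n, the inputs g0, g1, RGB7|0, and RGA7|0 are supplied by the caller as bytes, and the function name `sbox_round` is hypothetical. The surrounding run-up loop and the register clocking are not modeled.

```python
from typing import List, Tuple


def sbox_round(S: List[int], n: int, g0: int, g1: int,
               rgb70: int, rga70: int) -> Tuple[int, int]:
    """Apply one S-Box update in place and return (Y0, Y1)."""
    n %= 256
    y0p = (g0 + S[rgb70] + S[n]) % 256
    S[n], S[y0p] = S[y0p], S[n]          # swap, assuming i == n
    y1p = (g1 + S[rga70] + S[n]) % 256
    S[n], S[y1p] = S[y1p], S[n]          # swap, assuming i == n
    t0 = (S[g1] + S[y0p]) % 256
    t1 = (S[g0] + S[y1p]) % 256
    return S[t0], S[t1]


# Linear initialization: S[0] = 0, S[1] = 1, ..., S[255] = 255.
S = list(range(256))
y0, y1 = sbox_round(S, n=0, g0=1, g1=2, rgb70=3, rga70=4)
```

Because every update is a swap, the table remains a permutation of 0 through 255 after any number of rounds.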

Keystream bytes 100 from the Keystream Register 60
are produced based on the control clock, CLOCK. These
bytes are assembled to form the 7 bytes or 56 bits of
output only after the algorithm has had adequate mixing
time to circulate data among the FSRs. The amount of
mixing time is measured in the number of clock cycles
of CLOCK between the time FSRs A, B, and C are
initialized, and the time that the Keystream Register
60 byte output is stored. The initialization takes
place at time T=0, where time is measured in CLOCK
cycles. After initialization, 701 CLOCK cycles may
occur before the first byte of Keystream Register 60
output is stored. Thereafter, output bytes may be
stored after the completion of the following additional
cycles:

Table 6: CLOCK Cycles from Initialization to Output Storage

CLOCK Cycle Number    Event
0                     Initialize FSR A, B, and C
701                   Store output byte 0
701 + 30 = 731        Store output byte 1
701 + 70 = 771        Store output byte 2
701 + 126 = 827       Store output byte 3
701 + 142 = 843       Store output byte 4
701 + 580 = 1281      Store output byte 5
701 + 636 = 1337      Store output byte 6
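The storage schedule of Table 6 amounts to a fixed run-up of 701 cycles plus a per-byte offset. A minimal sketch, with the hypothetical names `RUNUP_CYCLES`, `OFFSETS`, and `storage_cycles`:

```python
RUNUP_CYCLES = 701
# Additional CLOCK cycles after the run-up for output bytes 0 through 6.
OFFSETS = [0, 30, 70, 126, 142, 580, 636]


def storage_cycles(runup: int = RUNUP_CYCLES, offsets=OFFSETS):
    """Return the CLOCK cycle numbers at which output bytes are stored."""
    return [runup + offset for offset in offsets]
```

Calling `storage_cycles()` yields the seven cycle numbers listed in the table, one per byte of the 56-bit output.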



The embodiments of the invention described above
are selected preferred embodiments and are not intended
to provide a limitation on the scope of the invention.
It should be appreciated that many variations of the
implementations described above are available. For
example, the invention described herein can be
implemented using a wide variety of input key sizes. In
addition, various Galois field selections may be used
to implement the invention. The number of static and/or
dynamic feedback shift registers used may vary, as well
as the number of register stages for each such shift
register. The number of randomization stages may vary,
as well as the functionality contained within each
stage (e.g., number of multiplexers, number of S-Boxes,
number of randomization tables, and the like). Other
implementation complexities can be varied or changed
consistent with the general description of the
invention described above.

It should now be appreciated that the present
invention provides improved methods and apparatus for
generating a cryptographically robust keystream for use
in data encryption and decryption.

Although the invention has been described in
connection with various illustrated embodiments,
numerous modifications and adaptations may be made
thereto without departing from the spirit and scope of
the invention as set forth in the claims.



